How to Center Binary Deep Boltzmann Machines

Authors

  • Jan Melchior
  • Asja Fischer
  • Laurenz Wiskott
Abstract

This work analyzes centered binary Restricted Boltzmann Machines (RBMs) and binary Deep Boltzmann Machines (DBMs), where centering is done by subtracting offset values from the visible and hidden variables. We show analytically that (i) centering results in a different but equivalent parameterization for artificial neural networks in general, (ii) the expected performance of centered binary RBMs/DBMs is invariant under a simultaneous flip of data and offsets, for any offset value in the range of zero to one, (iii) centering can be reformulated as a different update rule for normal binary RBMs/DBMs, and (iv) using the enhanced gradient is equivalent to setting the offset values to the average of the model and data means. Furthermore, numerical simulations suggest that (i) optimal generative performance is achieved by subtracting mean values from the visible as well as the hidden variables, (ii) centered RBMs/DBMs reach significantly higher log-likelihood values than normal binary RBMs/DBMs, (iii) centering variants whose offsets depend on the model mean, like the enhanced gradient, suffer from severe divergence problems, (iv) learning is stabilized if an exponentially moving average over the batch means is used for the offset values instead of the current batch mean, which also prevents the enhanced gradient from diverging, (v) centered RBMs/DBMs reach higher log-likelihood values than normal RBMs/DBMs while having a smaller norm of the weight matrix, (vi) centering leads to an update direction that is closer to the natural gradient, and the natural gradient itself is extremely efficient for training RBMs, (vii) centering dispenses with the need for greedy layer-wise pre-training of DBMs, (viii) pre-training often even worsens the results, independently of whether centering is used or not, and (ix) centering is also beneficial for autoencoders.
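
As a reading aid, the centering operation described above can be made concrete. In the centering literature, a centered binary RBM subtracts offsets μ and λ from the visible and hidden states in the energy function; the notation below is the common convention, not a quotation from this paper:

E(\mathbf{v},\mathbf{h}) = -(\mathbf{v}-\boldsymbol{\mu})^{\top}\mathbf{b} - (\mathbf{h}-\boldsymbol{\lambda})^{\top}\mathbf{c} - (\mathbf{v}-\boldsymbol{\mu})^{\top} W (\mathbf{h}-\boldsymbol{\lambda}),

where μ = λ = 0 recovers the normal binary RBM and μ = ⟨v⟩, λ = ⟨h⟩ gives the fully centered variant. A minimal sketch of one centered CD-1 update with exponentially-moving-average offsets (point (iv) of the simulations) might look as follows; the names and the learning-rate/sliding-factor values are illustrative assumptions, not the paper's reference implementation:

import numpy as np

# Minimal sketch of one centered CD-1 step for a binary RBM with
# exponentially-moving-average offsets. Names (W, b, c, mu, lam, nu)
# and defaults are illustrative, not taken from the paper's code.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def centered_cd1_step(W, b, c, mu, lam, v_data, lr=0.01, nu=0.01, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # Positive phase: hidden probabilities given the data batch.
    h_data = sigmoid((v_data - mu) @ W + c)
    # Negative phase: one Gibbs step from sampled hidden states.
    h_sample = (rng.random(h_data.shape) < h_data).astype(float)
    v_model = sigmoid((h_sample - lam) @ W.T + b)
    h_model = sigmoid((v_model - mu) @ W + c)
    # Move offsets toward the batch means (the EMA stabilization) and
    # reparameterize the biases so the model distribution is unchanged.
    mu_new = (1.0 - nu) * mu + nu * v_data.mean(axis=0)
    lam_new = (1.0 - nu) * lam + nu * h_data.mean(axis=0)
    b = b + W @ (lam_new - lam)
    c = c + W.T @ (mu_new - mu)
    mu, lam = mu_new, lam_new
    # Centered gradient approximation and parameter update.
    grad_W = ((v_data - mu).T @ (h_data - lam)
              - (v_model - mu).T @ (h_model - lam)) / len(v_data)
    W = W + lr * grad_W
    b = b + lr * (v_data - v_model).mean(axis=0)
    c = c + lr * (h_data - h_model).mean(axis=0)
    return W, b, c, mu, lam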

Similar resources

How to Center Binary Restricted Boltzmann Machines

It has recently been shown that subtracting the mean from the visible as well as the hidden variables of deep Boltzmann machines leads to better conditioned optimization problems and improves some aspects of model performance. In this work we analyze binary restricted Boltzmann machines, where centering is done by subtracting offset values from visible and hidden variables. We show analytically...

Notes on Boltzmann Machines

I. INTRODUCTION Boltzmann machines are probability distributions on high dimensional binary vectors which are analogous to Gaussian Markov Random Fields in that they are fully determined by first and second order moments. A key difference however is that augmenting Boltzmann machines with hidden variables enlarges the class of distributions that can be modeled, so that in principle it is possib...
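
For orientation, the "first and second order moments" claim corresponds to the standard fully visible Boltzmann machine form (a textbook formula, not a quotation from these notes):

p(\mathbf{x}) = \frac{1}{Z}\exp\!\Big(\tfrac{1}{2}\mathbf{x}^{\top} W \mathbf{x} + \mathbf{b}^{\top}\mathbf{x}\Big), \qquad \mathbf{x} \in \{0,1\}^{n},

so W and b play roles analogous to the precision matrix and mean of a Gaussian Markov Random Field.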

How to Center Deep Boltzmann Machines

This work analyzes centered Restricted Boltzmann Machines (RBMs) and centered Deep Boltzmann Machines (DBMs), where centering is done by subtracting offset values from visible and hidden variables. We show analytically that (i) centered and normal Boltzmann Machines (BMs) and thus RBMs and DBMs are different parameterizations of the same model class, such that any normal BM/RBM/DBM can be trans...

Neural Autoregressive Distribution Estimation

We present Neural Autoregressive Distribution Estimation (NADE) models, which are neural network architectures applied to the problem of unsupervised distribution and density estimation. They leverage the probability product rule and a weight-sharing scheme inspired by restricted Boltzmann machines to yield an estimator that is both tractable and has good generalization performance. We discu...
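
The "probability product rule" mentioned here is the autoregressive factorization (a textbook identity, not a quotation from this abstract):

p(\mathbf{x}) = \prod_{d=1}^{D} p(x_d \mid \mathbf{x}_{<d}),

where NADE models each conditional with a feed-forward network whose weights are shared across the D conditionals, RBM-style, which keeps evaluation tractable.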

Advances in Deep Learning

Deep neural networks have recently become increasingly popular under the name of deep learning, owing to their success in challenging machine learning tasks. Although the popularity is mainly due to these recent successes, the history of neural networks goes as far back as 1958, when Rosenblatt presented a perceptron learning algorithm. Since then, various kinds of artificial neural networks hav...


Publication date: 2013